Feature extraction for robust speech recognition using a power-law nonlinearity and power-bias subtraction

Authors

  • Chanwoo Kim
  • Richard M. Stern
Abstract

This paper presents a new feature extraction algorithm called PNCC that is based on auditory processing. The major new features of PNCC processing are a power-law nonlinearity that replaces the traditional log nonlinearity used in MFCC processing, and a novel algorithm that suppresses background excitation using medium-duration power estimation based on the ratio of the arithmetic mean to the geometric mean, followed by subtraction of the medium-duration background power. Experimental results demonstrate that PNCC processing provides substantial improvements in recognition accuracy compared to MFCC and PLP processing for various types of additive noise. The computational cost of PNCC is only slightly greater than that of conventional MFCC processing.
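The building blocks named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the exponent of 0.1, the window half-width, and the flooring rule are all assumptions chosen for illustration, and the sketch omits the step that decides how much background power to subtract.

```python
import math


def power_law(power, exponent=0.1):
    """Power-law compression used in place of log(power).

    The exponent value is an illustrative assumption, not taken from the abstract.
    """
    return power ** exponent


def medium_duration_power(frame_powers, half_width=2):
    """Average per-frame power over up to 2 * half_width + 1 neighbouring frames.

    The window is clipped at the start and end of the sequence.
    """
    n = len(frame_powers)
    smoothed = []
    for t in range(n):
        lo = max(0, t - half_width)
        hi = min(n, t + half_width + 1)
        smoothed.append(sum(frame_powers[lo:hi]) / (hi - lo))
    return smoothed


def am_gm_ratio(powers, floor=1e-12):
    """Ratio of the arithmetic mean to the geometric mean of a power sequence.

    The ratio is 1.0 for a constant sequence and grows as the sequence gets peakier.
    """
    clipped = [max(p, floor) for p in powers]
    arithmetic_mean = sum(clipped) / len(clipped)
    log_geometric_mean = sum(math.log(p) for p in clipped) / len(clipped)
    return arithmetic_mean / math.exp(log_geometric_mean)


def subtract_bias(powers, bias, floor_fraction=0.01):
    """Subtract a constant background power, flooring at a fraction of the input.

    The floor keeps every value positive; its form is a hypothetical choice.
    """
    return [max(p - bias, floor_fraction * p) for p in powers]


if __name__ == "__main__":
    clean = [1.0, 4.0]
    noisy = [p + 5.0 for p in clean]       # constant additive background power
    restored = subtract_bias(noisy, 5.0)   # assumes the bias is already known
    print(am_gm_ratio(clean), am_gm_ratio(noisy), am_gm_ratio(restored))
```

In this toy example the constant background power pulls the arithmetic-to-geometric mean ratio toward 1, and subtracting it restores the ratio of the clean sequence. This suggests why the ratio can serve as a cue for choosing the amount of power to subtract.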

Similar resources

Improving the performance of MFCC for Persian robust speech recognition

Mel-frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing step which applies to t...

Regularized MVDR spectrum estimation-based robust feature extractors for speech recognition

In this paper, we present two robust feature extractors that use a regularized minimum variance distortionless response (RMVDR) spectrum estimator instead of the discrete Fourier transform-based direct spectrum estimator, used in many front-ends including the conventional MFCC, for estimating the speech power spectrum. Direct spectrum estimators, e.g., single tapered periodogram, have high vari...

Modified MFCC Methods Based on KL-Transform and Power Law for Robust Speech Recognition

This paper presents robust feature extraction techniques, called Mel Power Karhunen Loeve Transform Coefficients (MPKC) and Mel Power Coefficients (MPC), for isolated digit recognition. This hybrid method involves Stevens’ Power Law of Hearing and the Karhunen Loeve (KL) Transform to improve noise robustness. We have evaluated the proposed methods on a Hidden Markov Model (HMM) based isolated digit r...

Noise Robust Speaker-Independent Speech Recognition with Invariant-Integration Features Using Power-Bias Subtraction

This paper presents new results about the robustness of invariant-integration features (IIF) in noisy conditions. Furthermore, it is shown that a feature-enhancement method known as “power-bias subtraction” for noisy conditions can be combined with the IIF approach to improve its performance in noisy environments while keeping the robustness of the IIFs to mismatching vocal-tract length training-t...

Smoothed Nonlinear Energy Operator-Based Amplitude Modulation Features for Robust Speech Recognition

In this paper we present a robust feature extractor that uses smoothed nonlinear energy operator (SNEO)-based amplitude modulation features for a large vocabulary continuous speech recognition (LVCSR) task. SNEO estimates the energy required to produce the AM-FM signal, and then the estimated energy is separated into its amplitude and frequency components using an energy separa...

Publication date: 2009